Summarization of Spoken Language — Challenges, Methods, and Prospects

Author

  • Klaus Zechner
Abstract

While the summarization of written texts has been explored for many decades and has gained significantly increased attention in the last five to ten years, summarization of spoken language is a comparatively recent research area. As the number of spoken audio databases is growing rapidly, however, we predict that the need for high-quality summarization of the information contained in this medium will rise substantially. Summarization of spoken language may also aid the archiving, indexing, and retrieval of various records of oral communication, such as corporate meetings, sales interactions, or customer support. The purpose of this paper is to place summarization of spoken language in the context of general summarization research, to describe its main challenges, which come on top of the already challenging area of written text summarization, to describe past and current approaches and systems, and finally to provide a tentative outlook on future directions in research and development of spoken language summarization systems.

Similar resources

Positional language modeling for extractive broadcast news speech summarization

Extractive summarization, with the intention of automatically selecting a set of representative sentences from a text (or spoken) document so as to concisely express the most important theme of the document, has been an active area of experimentation and development. A recent trend of research is to employ the language modeling (LM) approach for important sentence selection, which has proven to...

Problems and Prospects in Collection of Spoken Language Data

In this paper, we focus on the information in speech data and discuss the research issues involved in collecting, organizing, indexing, retrieving, and summarizing speech data. We share our experience about the problems and prospects in collection of spoken language data. We highlight some of the procedures and standards that need to be adapted in collecting the speech data, and discuss our pl...

Enhanced language modeling for extractive speech summarization with sentence relatedness information

Extractive summarization is intended to automatically select a set of representative sentences from a text or spoken document that can concisely express the most important topics of the document. Language modeling (LM) has been proven to be a promising framework for performing extractive summarization in an unsupervised manner. However, there remain two fundamental challenges facing existing LM...

Transcribing human-directed speech for spoken language processing

As storage costs drop and bandwidth increases, there has been a rapid growth of spoken information available via the web or in online archives, raising problems of document retrieval, information extraction, summarization and translation for spoken language. While there is a long tradition of research in these technologies for text, new challenges arise when moving from written to spoken langua...

Extractive Spoken Document Summarization with Representation Learning Techniques

The rapidly increasing availability of multimedia associated with spoken documents on the Internet has prompted automatic spoken document summarization to be an important research subject. Thus far, the majority of existing work has focused on extractive spoken document summarization, which selects salient sentences from an original spoken document according to a target summarization ratio and ...

Publication date: 2002